Arabic Text Classification Framework Based on Latent Dirichlet Allocation


Similar resources

Arabic Text Classification Framework Based on Latent Dirichlet Allocation

Current research usually adopts the Vector Space Model to represent documents in Text Classification applications. In this representation, a document is encoded as a vector of words or n-grams. These features cannot capture semantic or textual content, which results in a huge feature space and semantic loss. The model proposed in this work adopts “topics” sampled by an LDA model as text features. It effectively avoids t...

Semi-supervised Latent Dirichlet Allocation for Multi-label Text Classification

This paper proposes a semi-supervised latent Dirichlet allocation (ssLDA) method, which differs from existing supervised topic models for multi-label classification in two main aspects. First, both labeled and unlabeled learning data are used in ssLDA to train a model, which is very important for reducing the cost of manual labeling, especially when obtaining a fully labeled dataset is...

Multi-label Classification Algorithm Based on Latent Dirichlet Allocation Model

The Vector Space Model (VSM) is used frequently in Text Classification (TC). However, it usually produces a high-dimensional feature space, which leads to a huge cost in computation and storage. Recently, statistical topic models have come to play an important role in the fields of Information Retrieval (IR), TC and Document Clustering. In this chapter, we try to use a kind of statistical model, Latent Dirichlet Allo...

Aurora Image Classification Based on Multi-Feature Latent Dirichlet Allocation

Due to the rich physical meaning of aurora morphology, the classification of aurora images is an important task for polar scientific expeditions. However, the traditional classification methods do not make full use of the different features of aurora images, and the dimension of the description features is usually so high that it reduces the efficiency. In this paper, through combining multiple...

Similarity Measures Based on Latent Dirichlet Allocation

We present in this paper the results of our investigation on semantic similarity measures at word- and sentence-level based on two fully-automated approaches to deriving meaning from large corpora: Latent Dirichlet Allocation, a probabilistic approach, and Latent Semantic Analysis, an algebraic approach. The focus is on similarity measures based on Latent Dirichlet Allocation, due to its novelty ...

Journal

Journal title: Journal of Computing and Information Technology

Year: 2012

ISSN: 1330-1136

DOI: 10.2498/cit.1001770